CV - Module Project Part-1 Face Detection

  • DOMAIN: Entertainment

  • CONTEXT:
Company X owns a movie application and repository that streams movies to millions of users on a subscription basis. The company wants to automate the display of cast and crew information for each scene, so that when a user pauses the movie and clicks the cast information button, the app shows details of the actors in the scene. The company has in-house computer vision and multimedia experts who need to detect faces in screenshots of movie scenes.

  • DATA DESCRIPTION:
The dataset comprises images and their masks, which mark where there is a human face. [Source]

  • PROJECT OBJECTIVE:

Face detection from training images.

  • Import the dataset

  • The training dataset comprises 409 RGB images (screenshots) from various movies. They must be resized to a common size during feature creation before they can be used in models.

  • Create features (images) and labels (mask) using the data
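
A minimal sketch of this step, assuming (the source does not state the annotation format) that each face is described by a bounding box in normalized [0, 1] coordinates. The helper names `resize_nearest` and `boxes_to_mask` and the 224 x 224 target size are illustrative choices, not the notebook's actual code.

```python
import numpy as np

TARGET_SIZE = 224  # assumed common size; the notebook's actual value is not stated


def resize_nearest(image, size=TARGET_SIZE):
    """Resize an H x W x C image to size x size with nearest-neighbour sampling."""
    height, width = image.shape[:2]
    rows = np.arange(size) * height // size
    cols = np.arange(size) * width // size
    return image[rows[:, None], cols]


def boxes_to_mask(boxes, size=TARGET_SIZE):
    """Build a size x size binary mask from normalized (x0, y0, x1, y1) boxes."""
    mask = np.zeros((size, size), dtype=np.float32)
    for x0, y0, x1, y1 in boxes:
        left, top = int(x0 * size), int(y0 * size)
        right, bottom = int(x1 * size), int(y1 * size)
        mask[top:bottom, left:right] = 1.0
    return mask


# Hypothetical example: one 300 x 400 RGB screenshot with a single face box.
image = np.random.randint(0, 256, size=(300, 400, 3), dtype=np.uint8)
feature = resize_nearest(image)
label = boxes_to_mask([(0.25, 0.25, 0.5, 0.75)])
```

In practice an image library would usually handle the resizing with better interpolation; nearest-neighbour sampling is used here only to keep the sketch dependency-free.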

  • Split the dataset into train and test sets
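
One way to perform such a split is sketched below with plain NumPy. The 80/20 ratio, the seed, and the function name `train_test_split_arrays` are assumptions for illustration; the source does not say which ratio or utility was used.

```python
import numpy as np


def train_test_split_arrays(features, labels, test_fraction=0.2, seed=42):
    """Shuffle features and labels together, then split them into train/test."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    n_test = int(round(len(features) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return features[train_idx], features[test_idx], labels[train_idx], labels[test_idx]


# Placeholder arrays standing in for the 409 resized images and their masks.
features = np.zeros((409, 8, 8, 3), dtype=np.float32)
labels = np.zeros((409, 8, 8), dtype=np.float32)
X_train, X_test, y_train, y_test = train_test_split_arrays(features, labels)
```

Using a single permutation for both arrays keeps every image aligned with its mask after shuffling.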

  • Mask detection model

  • Design a face mask detection model using U-Net together with pre-trained transfer learning models

  • The model has 10,258,689 (~10M) trainable parameters

  • Design own Dice Coefficient and Loss function
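
The Dice coefficient measures the overlap between a predicted mask and the ground-truth mask as twice the intersection divided by the sum of the two mask areas. The NumPy sketch below illustrates the idea. The `smooth` term and the `1 - Dice` loss form are assumptions; the notebook's own loss is not shown in the source and may combine Dice with another term such as cross-entropy.

```python
import numpy as np


def dice_coefficient(y_true, y_pred, smooth=1e-6):
    """Dice = 2 * |A ∩ B| / (|A| + |B|), with `smooth` guarding against 0/0."""
    y_true = np.asarray(y_true, dtype=np.float64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.float64).ravel()
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)


def dice_loss(y_true, y_pred):
    """Assumed loss form: 1 - Dice, so a perfect overlap gives a loss near 0."""
    return 1.0 - dice_coefficient(y_true, y_pred)


truth = np.array([[1, 1, 0, 0]])
pred = np.array([[1, 0, 0, 0]])
score = dice_coefficient(truth, pred)  # 2 * 1 / (2 + 1), roughly 0.667
```

For training, the same arithmetic would be written with the deep learning framework's tensor operations so that gradients can flow through it.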


  • Train, tune and test the model

  • Evaluate the model using testing data

  • The model has a test (out-of-sample) loss of 0.614 and a Dice Coefficient of 0.757. Because the model was trained on very few images, collecting more data or using a separate model for the harder images might improve the final metrics further.

  • Pickle the best model for future use
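
The source does not show what exactly is pickled. As a hedged sketch, the example below assumes the best model is represented by its weights as a list of NumPy arrays (with Keras, `model.get_weights()` returns such a list) and stores them with the standard `pickle` module. The file name and the dictionary layout are hypothetical.

```python
import os
import pickle
import tempfile

import numpy as np

# Stand-in for the best model's weights; with Keras these could come from
# model.get_weights(), which returns a list of NumPy arrays.
best_weights = [np.ones((3, 3)), np.zeros(3)]
checkpoint = {"weights": best_weights, "test_dice": 0.759}

path = os.path.join(tempfile.mkdtemp(), "best_model.pkl")  # hypothetical file name
with open(path, "wb") as handle:
    pickle.dump(checkpoint, handle)

# Later: restore the checkpoint for reuse.
with open(path, "rb") as handle:
    restored = pickle.load(handle)
```

Pickle files should only be loaded from trusted sources, because unpickling can execute arbitrary code.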

  • Use the “Prediction image” as an input to your designed model and display the output of the image.
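
A segmentation model of this kind typically outputs a per-pixel probability map. The sketch below shows one possible post-processing step: thresholding the map into a binary mask and deriving a bounding box from it. The 0.5 threshold, the helper name `mask_to_bbox`, and the toy 6 x 6 map are assumptions for illustration.

```python
import numpy as np


def mask_to_bbox(prob_mask, threshold=0.5):
    """Threshold a probability mask and return (top, left, bottom, right), or None."""
    binary = prob_mask > threshold
    if not binary.any():
        return None
    rows = np.where(binary.any(axis=1))[0]
    cols = np.where(binary.any(axis=0))[0]
    return int(rows[0]), int(cols[0]), int(rows[-1]) + 1, int(cols[-1]) + 1


# Hypothetical model output: a 6 x 6 probability map with one confident region.
prob_mask = np.full((6, 6), 0.1)
prob_mask[1:4, 2:5] = 0.9
bbox = mask_to_bbox(prob_mask)
```

This returns a single box enclosing every pixel above the threshold. An image with several faces would need an extra step, such as connected-component labelling, to obtain one box per face.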

  • Our objective was to train a face detection model that uses transfer learning with U-Net layers on top, and to fit and evaluate a model that produces bounding boxes or masks around all the faces in an input image. We obtained a test Dice Coefficient of 0.759.

  • The learning curve of the loss across epochs shows that both the training and validation losses have converged, which suggests that the model is not overfitting severely.

  • As seen in the final test image, the model does a good job of detecting faces, but it could perform even better with more training data.

  • Metrics:
    Train (In-Sample): Loss: 0.169, Dice Coefficient: 0.895
    Test (Out-of-Sample): Loss: 0.491, Dice Coefficient: 0.759